Visualizations and the Grammar of Graphics
MACS 30500
University of Chicago
January 9, 2017
| Dataset | n | Mean of x | Mean of y | Correlation |
|---|---|---|---|---|
| 1 | 11 | 9 | 7.500909 | 0.8164205 |
| 2 | 11 | 9 | 7.500909 | 0.8162365 |
| 3 | 11 | 9 | 7.500000 | 0.8162867 |
| 4 | 11 | 9 | 7.500909 | 0.8165214 |
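The figures above match Anscombe's quartet, which base R ships as the `anscombe` data frame (columns `x1` to `x4` and `y1` to `y4`). Assuming that is the data behind this slide, the summary could be computed along these lines; the column names in `quartet_summary` are illustrative choices, not taken from the slides.

```r
# Assumption: the slide's four datasets are Anscombe's quartet (built-in `anscombe`).
quartet_summary <- do.call(rbind, lapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  data.frame(
    dataset = i,
    n       = length(x),   # number of observations
    mean_x  = mean(x),
    mean_y  = mean(y),
    cor     = cor(x, y)    # Pearson correlation between x and y
  )
}))

print(quartet_summary)
```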
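Dataset 1
| Term | Estimate | Std. error | t statistic | p-value |
|---|---|---|---|---|
| (Intercept) | 3.0000909 | 1.1247468 | 2.667348 | 0.0257341 |
| x | 0.5000909 | 0.1179055 | 4.241455 | 0.0021696 |

Dataset 2
| Term | Estimate | Std. error | t statistic | p-value |
|---|---|---|---|---|
| (Intercept) | 3.000909 | 1.1253024 | 2.666758 | 0.0257589 |
| x | 0.500000 | 0.1179637 | 4.238590 | 0.0021788 |

Dataset 3
| Term | Estimate | Std. error | t statistic | p-value |
|---|---|---|---|---|
| (Intercept) | 3.0024545 | 1.1244812 | 2.670080 | 0.0256191 |
| x | 0.4997273 | 0.1178777 | 4.239372 | 0.0021763 |

Dataset 4
| Term | Estimate | Std. error | t statistic | p-value |
|---|---|---|---|---|
| (Intercept) | 3.0017273 | 1.1239211 | 2.670763 | 0.0255904 |
| x | 0.4999091 | 0.1178189 | 4.243028 | 0.0021646 |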
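The four coefficient tables are what a simple linear regression of y on x returns for each dataset. A minimal sketch, assuming the datasets are Anscombe's quartet as shipped in base R (`anscombe`); the object names `fits` and `coef_tables` are illustrative.

```r
# Assumption: the four datasets are Anscombe's quartet (built-in `anscombe`).
# Fit y_i ~ x_i separately for each of the four datasets.
fits <- lapply(1:4, function(i) {
  lm(reformulate(paste0("x", i), response = paste0("y", i)), data = anscombe)
})
names(fits) <- paste("Dataset", 1:4)

# One matrix per dataset: estimate, standard error, t value, p-value.
coef_tables <- lapply(fits, function(fit) coef(summary(fit)))
print(coef_tables)
```

The near-identical estimates across four quite different datasets are the usual argument for plotting data before modelling it.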
Grammar
The whole system and structure of a language or of languages in general, usually taken as consisting of syntax and morphology (including inflections) and sometimes also phonology and semantics.
Grammar of graphics
- “The fundamental principles or rules of an art or science”
- Grammar of graphics - a grammar used to describe and create a wide range of statistical graphics
- Layered grammar of graphics
Layered grammar of graphics
- Layer
  - Data
  - Mapping
  - Statistical transformation (stat)
  - Geometric object (geom)
  - Position adjustment (position)
- Scale
- Coordinate system (coord)
- Faceting (facet)
- Defaults
Layer
- Responsible for creating the objects that we perceive on the plot
- Defined by its subcomponents
Data and mapping
- Data defines the source of the information to be visualized
- Mapping defines how the variables are applied to the graphic
Data: mpg
## # A tibble: 234 × 11
##    manufacturer model      displ  year   cyl trans      drv     cty   hwy
##    <chr>        <chr>      <dbl> <int> <int> <chr>      <chr> <int> <int>
##  1 audi         a4           1.8  1999     4 auto(l5)   f        18    29
##  2 audi         a4           1.8  1999     4 manual(m5) f        21    29
##  3 audi         a4           2.0  2008     4 manual(m6) f        20    31
##  4 audi         a4           2.0  2008     4 auto(av)   f        21    30
##  5 audi         a4           2.8  1999     6 auto(l5)   f        16    26
##  6 audi         a4           2.8  1999     6 manual(m5) f        18    26
##  7 audi         a4           3.1  2008     6 auto(av)   f        18    27
##  8 audi         a4 quattro   1.8  1999     4 manual(m5) 4        18    26
##  9 audi         a4 quattro   1.8  1999     4 auto(l5)   4        16    25
## 10 audi         a4 quattro   2.0  2008     4 manual(m6) 4        20    28
## # ... with 224 more rows, and 2 more variables: fl <chr>, class <chr>
Data: mpg
## # A tibble: 234 × 2
## displ hwy
## <dbl> <int>
## 1 1.8 29
## 2 1.8 29
## 3 2.0 31
## 4 2.0 30
## 5 2.8 26
## 6 2.8 26
## 7 3.1 27
## 8 1.8 26
## 9 1.8 25
## 10 2.0 28
## # ... with 224 more rows
Mapping: mpg
## # A tibble: 234 × 2
## x y
## <dbl> <int>
## 1 1.8 29
## 2 1.8 29
## 3 2.0 31
## 4 2.0 30
## 5 2.8 26
## 6 2.8 26
## 7 3.1 27
## 8 1.8 26
## 9 1.8 25
## 10 2.0 28
## # ... with 224 more rows
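The three tables above go from the full `mpg` data, to the two variables being plotted, to the same columns relabelled by the aesthetics they are mapped to. A rough sketch of those two steps with dplyr (the use of dplyr and the names `plot_data` and `mapped` are assumptions; ggplot2 does the relabelling internally through `aes()`):

```r
library(ggplot2)  # provides the mpg data
library(dplyr)

# Data: keep only the variables that will appear in the plot.
plot_data <- select(mpg, displ, hwy)

# Mapping: relabel each variable by the aesthetic it is assigned to,
# mirroring aes(x = displ, y = hwy).
mapped <- rename(plot_data, x = displ, y = hwy)
mapped
```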
Raw data
## # A tibble: 234 × 1
## cyl
## <int>
## 1 4
## 2 4
## 3 4
## 4 4
## 5 6
## 6 6
## 7 6
## 8 4
## 9 4
## 10 4
## # ... with 224 more rows
## # A tibble: 4 × 2
## cyl n
## <int> <int>
## 1 4 81
## 2 5 4
## 3 6 79
## 4 8 70
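The second table is the raw `cyl` column after a statistical transformation: a count of rows for each distinct value. One way to reproduce it, assuming dplyr is available (`cyl_counts` is an illustrative name):

```r
library(ggplot2)  # provides the mpg data
library(dplyr)

# Count the number of rows for each value of cyl.
cyl_counts <- count(mpg, cyl)
cyl_counts
```

A bar chart applies this kind of counting transformation for you, which is why the raw data has 234 rows but the chart has only four bars.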
Geometric objects (geoms)
- Control the type of plot you create
  - 0 dimensions - point, text
  - 1 dimension - path, line
  - 2 dimensions - polygon, interval
- Geoms have specific aesthetics
  - Point geom - position, color, shape, and size
  - Bar geom - position, height, width, and fill
Point geom
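The figure for this slide is not available here. As a stand-in (not necessarily the plot shown in lecture), a point geom on the `mpg` data might look like this, with `class` mapped to color as an illustrative extra aesthetic:

```r
library(ggplot2)

# Each row of mpg becomes one point; position comes from displ and hwy,
# color from class.
p_point <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point()

p_point
```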

Bar geom
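The figure for this slide is not available here. A stand-in sketch of a bar geom (the choice of `cyl` as the variable is an assumption): `geom_bar()` counts rows per x value and draws one bar per count.

```r
library(ggplot2)

# One bar per distinct value of cyl; bar height is the number of rows.
p_bar <- ggplot(mpg, aes(x = cyl)) +
  geom_bar()

# The computed layer data holds the counts behind the bars.
bar_counts <- layer_data(p_bar)$count
p_bar
```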

Position adjustment
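The figures for the position-adjustment slides are not available here. The sketch below shows four adjustments ggplot2 provides; the variables (`class`, `drv`) are illustrative choices rather than the ones used in lecture.

```r
library(ggplot2)

bars <- ggplot(mpg, aes(x = class, fill = drv))

p_stack <- bars + geom_bar(position = "stack")  # segments stacked on top of each other
p_dodge <- bars + geom_bar(position = "dodge")  # segments placed side by side
p_fill  <- bars + geom_bar(position = "fill")   # stacked, rescaled so each bar has height 1

# Jitter adds a little random noise so overlapping points become visible.
p_jitter <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(position = "jitter")
```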

Scale
- Controls the mapping from data to aesthetic attributes
Scale: color
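The figures for the color-scale slides are not available here. As a stand-in, the sketch below draws the same points twice: once with the default discrete color scale and once with a manual scale. The specific colors are arbitrary choices for illustration.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()

# Default scale: ggplot2 picks the colors for the three drv values.
p_default <- base

# Manual scale: the data-to-color mapping is stated explicitly.
p_manual <- base +
  scale_color_manual(values = c("4" = "black", "f" = "orange", "r" = "steelblue"))
```

Only the scale changes between the two plots; the data, mapping, and geom are identical.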

Coordinate system (coord)
- Maps the position of objects onto the plane of the plot
Cartesian coordinate system

Semi-log

Polar
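The figures for the Cartesian, semi-log, and polar slides are not available here. A stand-in sketch that draws one layer under each coordinate system, assuming the installed ggplot2 provides `coord_trans()`; the plotted variables are illustrative:

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

p_cartesian <- base + coord_cartesian()         # the default x-y plane
p_semilog   <- base + coord_trans(y = "log10")  # y axis on a log10 scale, x linear
p_polar     <- base + coord_polar()             # x becomes angle, y becomes radius
```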

Faceting
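The figure for this slide is not available here. A stand-in sketch of faceting, which splits the data into subsets and draws the same plot for each in its own panel (faceting on `class` is an illustrative choice):

```r
library(ggplot2)

# One panel per vehicle class, all sharing the same mapping and geom.
p_facet <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~ class)

p_facet
```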

Defaults
# Fully specified: every component of the grammar is written out.
ggplot() +
  layer(
    data = mpg, mapping = aes(x = displ, y = hwy),
    geom = "point", stat = "identity", position = "identity"
  ) +
  scale_x_continuous() +
  scale_y_continuous() +
  coord_cartesian()
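# Conceptual shorthand: the default scales and coordinate system are dropped,
# and the point geom implies stat = "identity" and position = "identity".
# Current ggplot2 releases require stat and position in a direct layer() call,
# so read this form as illustrative rather than runnable.
ggplot() +
  layer(
    data = mpg, mapping = aes(x = displ, y = hwy),
    geom = "point"
  )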
# Data and mapping become plot-wide defaults; geom_point() supplies the stat and position.
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point()
# Same plot with argument names omitted (matched by position).
ggplot(mpg, aes(displ, hwy)) +
  geom_point()
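To check that the defaults really do fill in the omitted pieces, the fully specified plot and the shortest form can be built side by side and their computed positions compared. This is a sketch; `p_full`, `p_short`, and `same_positions` are illustrative names.

```r
library(ggplot2)

# Every component spelled out.
p_full <- ggplot() +
  layer(
    data = mpg, mapping = aes(x = displ, y = hwy),
    geom = "point", stat = "identity", position = "identity"
  ) +
  scale_x_continuous() +
  scale_y_continuous() +
  coord_cartesian()

# Everything that has a default left out.
p_short <- ggplot(mpg, aes(displ, hwy)) +
  geom_point()

# Compare the x and y positions each plot computes for its points.
same_positions <- isTRUE(all.equal(
  layer_data(p_full)[c("x", "y")],
  layer_data(p_short)[c("x", "y")]
))
same_positions
```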
Minard’s grammar
- Troops
  - Latitude
  - Longitude
  - Survivors
  - Advance/retreat
- Cities
  - Latitude
  - Longitude
  - City name
- Layer
  - Data
  - Mapping
  - Statistical transformation (stat)
  - Geometric object (geom)
  - Position adjustment (position)
- Scale
- Coordinate system
- Faceting
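The outline above suggests two layers, each with its own data: a path for the troops and text labels for the cities. The sketch below shows only that structure. The data frames hold hypothetical placeholder values, not Minard's figures, and the column names are assumptions; it also assumes ggplot2 3.4.0 or later for the `linewidth` aesthetic.

```r
library(ggplot2)

# Hypothetical placeholder values, NOT Minard's data: just enough rows to draw.
troops <- data.frame(
  long      = c(24, 30, 37, 37, 30, 24),
  lat       = c(55, 55.5, 55.8, 55.6, 54.5, 54.3),
  survivors = c(400000, 250000, 100000, 100000, 40000, 10000),
  direction = c("advance", "advance", "advance", "retreat", "retreat", "retreat")
)
cities <- data.frame(
  long = c(24, 37),
  lat  = c(55, 55.8),
  city = c("City A", "City B")  # placeholder names
)

p_minard <- ggplot() +
  # Troops layer: position from longitude/latitude, line width from survivors,
  # color from advance/retreat.
  geom_path(
    data = troops,
    aes(x = long, y = lat, linewidth = survivors, color = direction, group = direction),
    lineend = "round"
  ) +
  # Cities layer: same position aesthetics, plus a text label.
  geom_text(
    data = cities,
    aes(x = long, y = lat, label = city),
    vjust = -1
  )
```

Both layers share the longitude and latitude scales and a single Cartesian coordinate system, with no faceting.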